We perform a systematic comparison of the statistical model parametrization of hadron abundances
observed in high energy pp, AA and e⁺e⁻ collisions. The basic aim of the study is to test
if the quality of the description depends on the nature of the collision process. In particular, we
want to see if nuclear collisions, with multiple initial interactions, lead to "more thermal" average
multiplicities than elementary pp collisions or e⁺e⁻ annihilation. Such a comparison is meaningful
only if it is based on data for the same or similar hadronic species and if the analyzed data have
quantitatively similar errors. When these requirements are met, the quality of the statistical
model description is found to be the same for the different initial collision configurations.



I. INTRODUCTION 

One of the most striking observations in high energy multihadron production is that both species
abundances and transverse momentum spectra (provided effects of collective flow and gluon radiation
are removed) follow the thermal pattern of an ideal hadron-resonance gas, with a universal temperature
T ≈ 160–170 MeV. This behavior was initially attributed to a temperature limit arising from
an exponentially growing resonance spectrum [2, 3] and is today generally taken as a reflection of the
quark-hadron transition temperature in QCD [4].

Thermal multihadron production has been investigated experimentally in a variety of collision processes,
from e⁺e⁻ annihilation and pp and pp̄ collisions to high energy nucleus-nucleus interactions. For sufficiently
high energies, these studies all led to the same universal hadronic resonance gas temperature, even though
there were other distinguishing features. In particular, it was observed that in elementary collisions,
strangeness production suffered an overall suppression, quite likely due to the heavier mass of the strange
quark. In nuclear interactions, this suppression is weaker or perhaps completely absent.¹

The origin of the success of a thermal picture for such a variety of different collision configurations
has been an enigma for a long time, extensively discussed in the literature [6]. In particular, given a
large number of possible multi-hadron channels, why does nature always choose to maximize entropy at a
universal temperature within a finite region? In heavy ion collisions, it is conceivable that this could have
something to do with the relatively large volume, which keeps the system confined for a long enough
time to allow sufficient inelastic scattering to reach equilibration. However, this cannot account for the
observation of thermal behaviour in elementary collisions, where such a mechanism cannot play any role;
also, thermalization through hadronic rescattering is hard to reconcile with the observation of a
centrality-independent chemical freeze-out temperature in relativistic heavy-ion collisions [7]. These facts are a
strong indication that the thermal behaviour is a feature of hadronization itself. One explanation proposed
for a universal spontaneous thermal hadron emission is that it arises through quantum tunnelling (Unruh
radiation) at the color event horizon [8]; the Schwinger mechanism [9] is a special case of such Unruh
radiation [10]. Among other ideas put forward to explain the mechanism of the apparent thermalization,
it is worth mentioning quantum chaos and Berry's conjecture [11, 12], still at a speculative stage.

¹ It has recently been noted that whatever strangeness suppression remains in heavy ion collisions can be accounted for by
residual ("corona") single nucleon-nucleon interactions [5].

In some recent studies, the extent of agreement of data with a thermal description was discussed
for different processes. In a "thermalization" framework where inelastic collisions play a major role, one
might expect that elementary processes do not agree as well with the same, albeit approximate,
statistical model formula as nuclear collisions do, with a sort of hierarchy from
e⁺e⁻ to AA. Such a view would be supported if it were found that the agreement, as specified by e.g.
χ²/dof, is simply much better for AA data than it is for that from e⁺e⁻ annihilation. This has in fact
been claimed [13]; however, if such comparisons are to be conclusive, several conditions have to be met.

• First of all, any comparison should be based on essentially the same set of species. Ideally, these 
should be narrow and hence long-lived resonance states, in order to avoid the difficulties encountered 
in determining the rates of broad short-lived states, concerning background separation, feed-down 
and branching ratios. This requirement leaves in general some 10 to 15 states and so still allows a 
significant comparative analysis. 

• Next, if the comparison is to be based on the χ²/dof for the fits to be considered, the corresponding
data sets should have similar experimental errors. If the estimated theoretical values are approximations,
a more accurate data set may imply larger deviations between model and data in units of
errors, so that a comparison of the χ²/dof of two different sets makes sense only if their errors are
comparable.

• Finally, the basic idea of statistical hadronization can be implemented for a specific process in different
ways. Without decisive further information, a comparison of the quality of the fits remains the only
tool to judge which scheme is closest to reality.

We have here emphasized the use of χ²/dof as a tool to compare different experimental configurations
as well as different model implementations. The reason for this is, as we shall discuss in more detail in
Section III, that the absolute value of χ²/dof has to be interpreted with much care. Since the theoretical
formulae employed in any statistical model fit are only approximations of the full dynamics, deviations
must appear, as has been mentioned, once the measurements become sufficiently precise, and hence
χ²/dof must then become large. This point was already made about 10 years ago [Tj3], when discussing
the effect of local fluctuations of thermodynamical parameters.

With these caveats in mind, we have selected three extensive data sets for our comparative analysis. In
Section II, following a short summary defining the details of the underlying statistical model framework,
we compare recent hadroproduction data from pp collisions, taken by the STAR experiment at RHIC
for √s = 200 GeV [16], to the corresponding results from Au-Au collisions [17] at the same energy and
measured by the same experimental group. One very remarkable result of this analysis is, as we shall
see, that the fit to the pp data is as good as it can possibly be, with a χ²/dof ≈ 1; the corresponding
Au-Au analysis leads to a less optimal fit. To elucidate the meaning of this result, we discuss in Section
III more generally the relevant features of comparing the statistical hadronization model to data and
testing the fits obtained. In Section IV, we then consider e⁺e⁻ data from LEP at 91.25 GeV [18] and
compare the fits for this to the Au-Au results. In Section V, we consider more generally tests of the
different implementations of the statistical model; in this context, we also consider possible origins of
recent apparently contradictory conclusions [12, 13] on the thermal description of e⁺e⁻ data.



II. A COMPARISON BETWEEN pp AND AA COLLISIONS 

In this Section we perform a statistical analysis of hadroproduction data at √s = 200 GeV at RHIC with
pp and Au-Au as initial collision configurations. The data in both cases are centre-of-mass midrapidity
densities from the same experiment [16, 17]. For pp collisions, there are 18 species of measured secondaries
p^| . including several short-lived strange meson and hyperon resonant states; for Au-Au interactions, we 
use a set of 12 rapidity densities of long-lived states at midrapidities [13], already studied in ref. 
with updated experimental errors. The observed abundances are listed in table I.
Particle                Measured dN/dy (E)     Relative error   Model dN/dy (M)   Residual   (M - E)/E (%)

pp collisions at √s = 200 GeV
π⁺                      1.44 ± 0.11            0.076            1.403             -0.34      -2.62
π⁻                      1.42 ± 0.11            0.077            1.384             -0.33      -2.59
K⁺                      0.150 ± 0.013          0.087            0.1522             0.17       1.48
K⁻                      0.145 ± 0.013          0.090            0.1460             0.076      0.68
p                       0.138 ± 0.012          0.087            0.1491             0.92       7.42
p̄                       0.113 ± 0.010          0.088            0.1120             0.66       5.56
φ                       0.0180 ± 0.0029        0.16             0.01130           -2.31      -59.3
Λ                       0.0436 ± 0.0041        0.094            0.04348           -0.030     -0.28
Λ̄                       0.0398 ± 0.0038        0.095            0.03686           -0.77      -7.96
Ξ⁻                      0.0026 ± 0.00092       0.35             0.003070           0.51       15.3
Ξ̄⁺                      0.0029 ± 0.00104       0.36             0.002728          -0.17      -6.29
Ω + Ω̄                   0.00034 ± 0.00019      0.56             0.0005712          1.22       40.5
K⁰_S                    0.134 ± 0.011          0.082            0.1467             1.15       8.64
ρ⁰                      0.259 ± 0.039          0.15             0.1861            -1.87      -39.2
(K*⁰ + K̄*⁰)/2           0.0508 ± 0.0063        0.12             0.05151            0.11       1.38
Σ*⁺ + Σ*⁻               0.0107 ± 0.00146       0.14             0.01028           -0.29      -4.12
Σ̄*⁺ + Σ̄*⁻               0.0089 ± 0.00126       0.14             0.008650          -0.20      -2.89
Λ(1520) + Λ̄(1520)       0.0069 ± 0.0011        0.16             0.005606          -1.18      -23.1

Au-Au collisions at √s_NN = 200 GeV
π⁺                      322 ± 25               0.078            330.0              0.32       2.41
π⁻                      327 ± 25               0.077            331.9              0.19       1.46
K⁺                      51.3 ± 6.5             0.13             57.65              0.98       11.0
K⁻                      49.5 ± 6.2             0.13             54.44              0.80       9.07
p                       34.7 ± 4.4             0.13             42.23              1.71       17.8
p̄                       26.7 ± 3.4             0.13             31.24              1.34       14.5
Λ                       16.7 ± 1.12            0.067            14.44             -2.02      -15.7
Λ̄                       12.7 ± 0.92            0.072            11.10             -1.74      -14.4
φ                       7.95 ± 0.74            0.093            6.697             -1.69      -18.7
Ξ⁻                      2.17 ± 0.20            0.092            2.024             -0.73      -7.20
Ξ̄⁺                      1.83 ± 0.206           0.11             1.676             -0.75      -9.16
Ω + Ω̄                   0.53 ± 0.057           0.11             0.6529             2.16       18.8

TABLE I: Measured and fitted mid-rapidity densities in pp and Au-Au collisions at 200 GeV; data from the STAR
experiment. For pp collisions, none of the quoted experimental numbers are corrected for weak decay feed-down,
while for Au-Au collisions all multiplicities are feed-down corrected, except protons and antiprotons [17].
Our model calculations were carried out accordingly.



For elementary collisions, the abundance ⟨n_j⟩ of hadron species j is in the statistical hadronization
model given by (for a detailed description, see ref. [Hj]) 

$$
\langle n_j \rangle^{\rm primary} = \frac{V T (2J_j+1)}{2\pi^2} \sum_{n=1}^{\infty} \gamma_s^{N_s n} (\mp 1)^{n+1} \frac{m_j^2}{n} K_2\!\left(\frac{n m_j}{T}\right) \frac{Z(Q - n q_j)}{Z(Q)} ,
\tag{1}
$$

where the temperature T, the strangeness suppression factor γ_s and the normalization volume V are taken as
free parameters; Q = (Q, B, S, ...) is the array of conserved charges and q_j the corresponding array
for the jth hadron species. Here m_j and J_j are the mass and spin of the hadron, N_s is its number of valence
strange quarks and antiquarks, K_2 is the modified Bessel function of the second kind, and the upper (lower)
sign applies to fermions (bosons). The "chemical" factors Z(Q − n q_j)/Z(Q) are ratios of partition functions
and replace the more familiar fugacities when the exact conservation of the initial charges is to be taken
into account, a typical feature of small systems also known as canonical suppression. To the primary
production for each species j one then adds the decay products of heavier states, using the experimentally
known branching ratios

$$
\langle n_j \rangle = \langle n_j \rangle^{\rm primary} + \sum_k \langle n_k \rangle \, {\rm BR}(k \to j) .
\tag{2}
$$
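As a minimal illustration of eq. (2), the sketch below adds single-step decay feed-down to a set of primary yields. The function name `add_feed_down`, the species labels and all numerical values are hypothetical placeholders chosen for readability (the ρ⁰ → π⁺π⁻ branching fraction is simply set to 1); decay chains and the actual resonance table used in the fits are not modelled.

```python
def add_feed_down(primary, branching):
    """Return final yields <n_j> = <n_j>^primary + sum_k <n_k> BR(k -> j).

    primary   : dict mapping species name -> primary yield
    branching : dict mapping parent -> {daughter: mean number of daughters per decay}
    Assumption: every parent decays in a single step, so its primary yield
    is used as <n_k>; multi-step decay chains are not followed.
    """
    final = dict(primary)  # start from the primary yields
    for parent, daughters in branching.items():
        for daughter, br in daughters.items():
            final[daughter] = final.get(daughter, 0.0) + primary[parent] * br
    return final


# Illustrative numbers only (not fitted values from this analysis).
primary_yields = {"pi+": 1.0, "pi-": 1.0, "rho0": 0.2}
decay_table = {"rho0": {"pi+": 1.0, "pi-": 1.0}}
final_yields = add_feed_down(primary_yields, decay_table)
print(final_yields)
```

The resonance itself is kept in the output, since short-lived states such as the ρ⁰ are also compared to data in their own right.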



Formulae (1) and (2) apply in principle to full phase space multiplicities, since possible charge-momentum
correlations are integrated out. In order to apply them nevertheless in a comparison of midrapidity data from
pp and Au-Au collisions at the same energy, we assume that particle ratios at midrapidity are essentially
the same as particle ratios in full phase space. While such an assumption is certainly not tenable at low
collision energy, it can be expected to hold better in high energy collisions with large rapidity coverage. We
thus assume that the primary rapidity density of each species in pp collisions is given by

$$
\left\langle \frac{dn_j}{dy} \right\rangle^{\rm primary}_{y=0} = \frac{A V T (2J_j+1)}{2\pi^2} \sum_{n=1}^{\infty} \gamma_s^{N_s n} (\mp 1)^{n+1} \frac{m_j^2}{n} K_2\!\left(\frac{n m_j}{T}\right) \frac{Z(Q - n q_j)}{Z(Q)} ,
\tag{3}
$$

where A is a common normalization factor taking into account the ratio of production in the midrapidity
interval to the overall rate. Note that A cannot be absorbed into the overall volume V, since
V still appears as a key parameter in the chemical factors Z(Q − n q_j)/Z(Q). Also, it is assumed that
subsequent decays do not alter the midrapidity densities, i.e. that formula (2) also holds.

In this form, the statistical model for pp collisions leads to a four-parameter fit: besides T, γ_s and V,
there is the rapidity-cut related normalization factor A. For heavy ion collisions, we have exactly the
same number of parameters; because of the large volume, formula (3) here becomes

$$
\left\langle \frac{dn_j}{dy} \right\rangle^{\rm primary}_{y=0} = \frac{A V T (2J_j+1)}{2\pi^2} \sum_{n=1}^{\infty} \gamma_s^{N_s n} (\mp 1)^{n+1} \frac{m_j^2}{n} K_2\!\left(\frac{n m_j}{T}\right) e^{\, n \mu \cdot q_j / T} ,
\tag{4}
$$

with the chemical factors now replaced by the fugacities, μ being the array of chemical potentials. In this
case, the factor A can be absorbed in V and the free parameters are T, γ_s, μ_B and AV.

The results of the fits to the midrapidity densities are shown in table I for both pp and Au-Au and in
fig. 1; the fit parameter values and the corresponding χ²/dof are listed in table II. The first block of
this table, labelled "full pp fit", gives the parameters for the full analysis of all pp species. In the middle
block of table II we then compare the pp and Au-Au fits using the same data sample.
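As a consistency check on the tabulated numbers, the following sketch sums the squares of the "Residual" column of table I. It assumes that this column is (M − E) divided by the experimental error and that the errors are treated as independent; under these assumptions the sums should reproduce the quoted minimum χ² values up to rounding. The variable names are illustrative.

```python
# "Residual" column of table I, in the order of the table rows.
residuals_pp = [-0.34, -0.33, 0.17, 0.076, 0.92, 0.66, -2.31, -0.030, -0.77,
                0.51, -0.17, 1.22, 1.15, -1.87, 0.11, -0.29, -0.20, -1.18]
residuals_auau = [0.32, 0.19, 0.98, 0.80, 1.71, 1.34, -2.02, -1.74, -1.69,
                  -0.73, -0.75, 2.16]

N_PARAMS = 4  # four free parameters in both the pp and the Au-Au fit


def chi2_and_dof(residuals, n_params):
    """Return (chi2, dof) for a list of normalized residuals."""
    chi2 = sum(r * r for r in residuals)
    return chi2, len(residuals) - n_params


chi2_pp, dof_pp = chi2_and_dof(residuals_pp, N_PARAMS)
chi2_auau, dof_auau = chi2_and_dof(residuals_auau, N_PARAMS)
print(f"pp:    chi2/dof = {chi2_pp:.1f}/{dof_pp}")
print(f"Au-Au: chi2/dof = {chi2_auau:.1f}/{dof_auau}")
```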

The most striking feature seems to us the high quality of the full pp fit. The resulting χ²/dof ≈ 1 is
the best value to be hoped for; it is as good as any thermal fit ever made for any high energy collision
configuration.²

Next, it is worth noting that the extracted temperature values are almost identical for pp and Au-Au
collisions; the γ_s value for the pp data agrees with previous pp analyses [21, 22]. The error on VT³
here is large because this parameter is essentially determined by the chemical factors in eq. (3).

In spite of the extremely similar values obtained for the hadronization temperature, we thus find that
the χ²/dof of the pp fits is a factor two better than that for the Au-Au analysis. A naive conclusion of
this comparison could then be that the statistical model leads to a better agreement with pp collisions
than with Au-Au. Can we conclude that pp collisions provide a "more thermal" configuration than heavy
ion collisions? Before answering this question, a general discussion about the statistical model formulae
and the meaning of statistical tests is now appropriate.



III. THEORETICAL MODELS AND χ² TESTS



Here we want to discuss in more detail two points that have been mentioned in the Introduction,
namely the meaning of a χ²/dof fit given only an approximate theoretical description, and how one can
compare two fits to such an incomplete input.

² A recent analysis [20] of pp data at a considerably lower energy, √s = 17 GeV, also finds very good agreement with a
different implementation of the statistical model.
                  pp √s = 200 GeV        Au-Au √s_NN = 200 GeV

Overall fit
T (MeV)           170.1 ± 3.5            168.5 ± 4.0
Normalization     0.027 ± 0.011          13.6 ± 0.58
VT³               135 ± 60               -
γ_s               0.569 ± 0.031          0.932 ± 0.040
μ_B/T             -                      0.173 ± 0.052
χ²/dof            15.6/14                22.2/8

Fit with standard sample
T (MeV)           169.8 ± 4.2            168.5 ± 4.0
Normalization     0.028 ± 0.012          13.6 ± 0.58
VT³               131 ± 60               -
γ_s               0.600 ± 0.033          0.932 ± 0.040
μ_B/T             -                      0.173 ± 0.052
χ²/dof            15.0/8                 22.2/8

Fit with standard sample and same relative errors
T (MeV)           170.9 ± 5.2            169.6 ± 4.8
Normalization     0.031 ± 0.015          12.6 ± 0.73
VT³               118 ± 63               -
γ_s               0.595 ± 0.038          0.999 ± 0.069
μ_B/T             -                      0.163 ± 0.065
χ²/dof            13.4/8                 9.6/8

TABLE II: Comparison between the statistical model fits in pp and Au-Au collisions at √s = 200 GeV. The
parameter referred to as "Normalization" is A in eq. (3) for pp collisions and AVT³ exp(−0.7 GeV/T) in eq. (4)
for Au-Au collisions.

In general, the test of a physical model is based on a fit to data and a statistical test of this
fit. Thus a χ² test tells us how likely it would be to obtain a value larger than the minimum χ² of
our fit, provided that the hypothesis, i.e., the model, is correct. Most often, however, the theoretical
formulae that we take as hypotheses are only an approximate representation of the underlying model or
theory. In other words, they can be expected to reproduce the data only up to some reasonably small
deviation. There are many instances of this situation; a simple example is the lowest-order perturbative
expansion of a differential cross section in high-energy collisions. If measurements are more accurate
than the estimated deviation, the χ² statistical test will obviously fail, indicating that corrections to
the lowest-order theoretical formula are necessary. Sometimes the theoretical description of the process
is fully under control and corrections are relatively easy to calculate (as for electroweak processes in
e⁺e⁻ collisions), sometimes they are not. The latter is the case in this work, where we test the statistical
hadronization model.
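The statement that a χ² test gives the probability of exceeding the fitted minimum can be made explicit with a short sketch. The function `chi2_sf` below is a hypothetical helper that evaluates the upper tail probability of the χ² distribution for integer degrees of freedom from the standard closed-form series; it assumes independent Gaussian errors and an exactly correct model, which is precisely the idealization questioned in this section.

```python
import math


def chi2_sf(x, dof):
    """P(chi2 > x) for a chi-square variable with a positive integer dof."""
    if x <= 0:
        return 1.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{i < dof/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, dof // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd dof: erfc(sqrt(x/2)) + sqrt(2/pi) exp(-x/2) * sum_r x^(r-1/2)/(1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total)


# chi2/dof values quoted in tables II and IV of this paper.
for label, chi2, dof in [("pp", 15.6, 14), ("Au-Au", 22.2, 8), ("e+e-", 41.5, 9)]:
    print(f"{label}: P(chi2 > {chi2}) = {chi2_sf(chi2, dof):.2e}")
```

Taken at face value, the small tail probabilities for the more precise data sets would reject the model; the discussion that follows explains why such absolute values should not be read that way.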

The basic premise of the statistical model is that high energy collisions lead to the formation of
multiple clusters, emitted sequentially in rapidity and decaying into hadrons according to their relative
phase space weights. The formulae (1) and (3) are specific implementations of this idea, based on
further assumptions besides the basic postulate of the model. In elementary collisions, it is
assumed that the probability of distributing the conserved charges among the actually produced clusters
has a special form: for instance, a statistical distribution of charges among the clusters, leading to the
equivalence with one global cluster (this is assumed in our work here). But other charge distribution
schemes are obviously conceivable, see ref. [23], and we shall return to this aspect in Section V. It is
therefore clear that if experimental multiplicities were known to a very good accuracy, discrepancies with
the predictions or fitted values of formula (1) could show up, even if the basic idea of purely statistical
decays of clusters/fireballs remains true. One would then have to correct (1) for effects of a non-statistical
distribution of charges among the clusters, etc. Unfortunately, the definition of such corrections requires
a more complete picture of the production process than we presently have. Of course, this does not mean
that approximate models cannot be disproved. It only means that, as long as corrections to the leading-order
formula are not available, we have to be content with a statement like "the statistical hadronization
model in its simplest approximation reproduces the data up to 10%", and we should take deviations from
formula (1) with great care.

FIG. 1: Above: fitted vs measured midrapidity densities in pp collisions at √s = 200 GeV. Below: residual
distributions.

Above and beyond these model-dependent details, we should also stress that any analytical multiplicity
formulae, such as (1) and (4) or variations thereof, provide by construction only an approximate description.
The use of these formulae will necessarily lead to deviations for sufficiently precise measurements.
To illustrate this situation, which is ubiquitous in physics, we recall a familiar example. The spectrum
of the sunlight measured at the top of the atmosphere is usually fitted to a black body formula, yielding
a temperature of about 5780 K. The formula is accurate to about 10%, but the fit quality is extremely
bad, with a huge χ² (see e.g. ref. [24]). The reason for such a discrepancy is that on a microscopic level,
stars are not perfect radiators; different effects, such as the absorption of light by atoms and/or ions
blocking part of the outward radiation path, surface non-uniformities, etc., all cause deviations from the
simple black-body spectrum, and these are revealed when the accuracy of the photometric measurement
is better than about 10%. The effects responsible for these deviations are difficult to embody in an analytical
formula and can only be studied numerically. Still, the failure of the lowest-order Planck formula
fit in passing a rigorous statistical χ² test has not led anyone to the conclusion that the surface of the
sun is not a thermal system.

In addition to such theoretical caveats, one must also bear in mind the role of experimental complications.
In fact, most of the measurements which are used to perform a fit include hidden correlations
which are difficult to assess (e.g., multiplicities of particles measured with the same detector). Therefore
the usual assumption of independent errors entering in the χ², which we have retained here throughout,
is only an approximation, and the actual absolute value of the best χ²/dof could well be different.

Hence, for theoretical as well as for experimental reasons, the use of absolute χ²/dof values to judge the
quality of statistical model descriptions, as suggested in ref. [13], appears to us as not really permissible.

But even with these caveats in mind, a comparative assessment of fit results in different collision configurations
is still possible. To be specific, if the fit quality to a given formula was better in AA collisions
than in e⁺e⁻ or pp interactions under the same conditions, this could of course mean that the deviations
observed in e⁺e⁻ and pp stem from a genuine failure of the model for this case, rather than from the
approximate nature of the employed formula. However, for such a comparison to make any sense, it must
fulfill the essential prerequisite noted in the Introduction. Thus, if the comparison is to be based on a
χ² test for the fits to be considered, the corresponding data sets must have comparable average experimental
errors. It is quite clear that if data sets with largely different accuracy are used, a comparison
of χ²/dof values could be misleading. To illustrate: if the data set A (example: hadronic multiplicities
in pp) with an average accuracy of 10% yields a fit with χ²/dof = 1 and the set B (example: hadronic
rapidity densities in heavy ion collisions) with an average accuracy of 1% yields a fit with χ²/dof = 3,
a blind comparison of the χ² values would lead us to the erroneous conclusion that the model can be used to
describe A, but not B.
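The example of sets A and B can be turned into a one-line estimate. Under the simplifying assumptions that all points in a set share the same relative error and that the number of fitted parameters is small compared to the number of points, χ²/dof is roughly the mean squared ratio of relative deviation to relative error, so the typical model-data deviation is the relative error times the square root of χ²/dof. The function name below is illustrative.

```python
import math


def implied_rms_deviation(chi2_per_dof, relative_error):
    """Typical relative model-data deviation implied by a fit quality.

    Assumes a common relative error for all points and dof ~ number of points,
    so that chi2/dof ~ <(deviation / error)^2>.
    """
    return relative_error * math.sqrt(chi2_per_dof)


dev_A = implied_rms_deviation(1.0, 0.10)  # set A: 10% errors, chi2/dof = 1
dev_B = implied_rms_deviation(3.0, 0.01)  # set B: 1% errors,  chi2/dof = 3
print(f"set A: model off by about {100 * dev_A:.1f}%")
print(f"set B: model off by about {100 * dev_B:.1f}%")
```

In this toy reading the model reproduces set B to better than 2% but set A only to about 10%, even though set B has the larger χ²/dof.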

We now return to our comparison of Sect. II and the hypothetical conclusion that the statistical model
works better in pp collisions than in Au-Au collisions. A comparison of χ² values requires that measurements
have the same experimental accuracy, but this is not the case here: as can be deduced from
table I, the relative experimental errors for the same species sets differ, with an average value of about
18% for the pp data but only 10% for Au-Au.
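The quoted averages can be checked directly from the "Relative error" column of table I, restricted to the 12 species common to both samples. This is only a bookkeeping sketch; the list names are illustrative and the values are copied from the table.

```python
# Relative errors from table I for the 12 common species, in the order
# pi+, pi-, K+, K-, p, pbar, phi, Lambda, Lambdabar, Xi-, Xibar+, Omega+Omegabar.
rel_err_pp = [0.076, 0.077, 0.087, 0.090, 0.087, 0.088,
              0.16, 0.094, 0.095, 0.35, 0.36, 0.56]
rel_err_auau = [0.078, 0.077, 0.13, 0.13, 0.13, 0.13,
                0.093, 0.067, 0.072, 0.092, 0.11, 0.11]

mean_pp = sum(rel_err_pp) / len(rel_err_pp)
mean_auau = sum(rel_err_auau) / len(rel_err_auau)
print(f"average relative error: pp {100 * mean_pp:.0f}%, Au-Au {100 * mean_auau:.0f}%")
```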

A possible way of comparing the quality of the two fits would be to look at the average relative deviation
between theoretical and experimental values. For each particle species this is shown in the last column
of table I. Indeed, the average deviations of the 12 common species are very similar: 12.5% in pp and
11.7% in Au-Au. This result is a strong indication that the statistical model indeed yields the same
quality of agreement in the examined cases. To illustrate this more precisely, we have made new fits with
the experimental errors rescaled such that the relative errors of the measurements are exactly equal and
determined by the largest error for each species. For instance, the K⁺ multiplicity in pp collisions is now
assigned an error equal to its relative error in heavy ion collisions times its multiplicity in pp collisions,
that is 0.13 × 0.150 = 0.0195, giving it the same relative error as the corresponding measurement in
Au-Au collisions. Conversely, for the Ω + Ω̄ the measurement in Au-Au collisions is far more accurate
than that in pp collisions; hence here a larger error is assigned to the heavy ion value in the same manner.
This procedure artificially enhances the experimental errors in both samples and can thus be used only
for illustrative purposes, neither to extract the best estimate of the model parameters, nor to make a
proper statistical test.
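The error-equalization step described above can be sketched as follows. The function `equalize_relative_errors` and the dictionary layout are hypothetical; the two species shown use the measured values of table I. Note that the sketch uses unrounded relative errors, so the K⁺ error comes out slightly below the 0.0195 obtained in the text from the rounded value 0.13.

```python
def equalize_relative_errors(sample_a, sample_b):
    """Assign to each common species the larger of its two relative errors.

    sample_a, sample_b: dicts mapping species -> (value, absolute error).
    Returns two new dicts with the rescaled absolute errors.
    """
    out_a, out_b = {}, {}
    for species in sample_a.keys() & sample_b.keys():
        value_a, err_a = sample_a[species]
        value_b, err_b = sample_b[species]
        rel = max(err_a / value_a, err_b / value_b)
        out_a[species] = (value_a, rel * value_a)
        out_b[species] = (value_b, rel * value_b)
    return out_a, out_b


pp = {"K+": (0.150, 0.013), "Omega+Omegabar": (0.00034, 0.00019)}
auau = {"K+": (51.3, 6.5), "Omega+Omegabar": (0.53, 0.057)}
pp_rescaled, auau_rescaled = equalize_relative_errors(pp, auau)
print(pp_rescaled["K+"])                 # pp error enlarged to the Au-Au relative error
print(auau_rescaled["Omega+Omegabar"])   # Au-Au error enlarged to the pp relative error
```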

The results of these fits are shown in table II (third block), and they show that the χ²/dof in Au-Au
is now slightly better than in pp collisions at the same energy. This reinforces our first assessment
that the simple statistical model formulae (3) and (4) agree with the data up to ≈ 10% both in pp and
Au-Au collisions and that none of the examined systems can be claimed to be "more thermal" in this
respect.
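The "up to about 10%" statement can be reproduced from the last column of table I by averaging the absolute percentage deviations over the 12 common species. The sketch below only repeats that arithmetic; list names are illustrative.

```python
# Last column of table I (percent deviation) for the 12 common species, in the order
# pi+, pi-, K+, K-, p, pbar, phi, Lambda, Lambdabar, Xi-, Xibar+, Omega+Omegabar.
dev_pp = [-2.62, -2.59, 1.48, 0.68, 7.42, 5.56,
          -59.3, -0.28, -7.96, 15.3, -6.29, 40.5]
dev_auau = [2.41, 1.46, 11.0, 9.07, 17.8, 14.5,
            -18.7, -15.7, -14.4, -7.20, -9.16, 18.8]


def mean_abs(values):
    """Mean of the absolute values."""
    return sum(abs(v) for v in values) / len(values)


mean_dev_pp = mean_abs(dev_pp)
mean_dev_auau = mean_abs(dev_auau)
print(f"mean |deviation|: pp {mean_dev_pp:.1f}%, Au-Au {mean_dev_auau:.1f}%")
```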



IV. A COMPARISON BETWEEN e⁺e⁻ AND AA COLLISIONS

Here we focus our attention on the largest and the most accurate e⁺e⁻ sample, data from LEP at
√s = 91.25 GeV; this we compare again to the heavy ion sample from RHIC at √s_NN = 200 GeV,
and we again choose a set of long-lived particles that is as common as possible. Since in e⁺e⁻ collisions
the multiplicities of particle and antiparticle are obviously equal, we now find only 7 common species (π±,
K±, Λ, p, φ, Ξ⁻, Ω). Therefore we have retained all 12 RHIC measurements (π±, K±, Λ, Λ̄, p, p̄, φ, Ξ⁻, Ξ̄⁺,
Ω + Ω̄) and added to the e⁺e⁻ data sample π⁰, K⁰_S, and the three long-lived Σ states.

The experimental values used for this comparison are a weighted average of the full phase space
multiplicities measured by the four LEP experiments; these were also used in our previous analysis [12].³
The resulting fit is to be compared to the Au-Au mid-rapidity densities measured by STAR [17] and
already used above. For comparison purposes, they are shown again in table III together with the
corresponding e⁺e⁻ fit values.

It should be noted that in the Au-Au collisions at √s_NN = 200 GeV the experimental relative error
is, in most cases, larger than for e⁺e⁻ collisions at LEP energy; the average error in the e⁺e⁻ data is
5.7%, to be compared to 10% for the heavy ion data. This suggests that the fit will result in a larger
χ²/dof value for e⁺e⁻ than for Au-Au, and we see in table IV that this is indeed the case.⁴ Therefore,
as emphasized above, a direct comparison of the χ²/dof values to assess the relative quality of the two
fits would again be misleading.

³ We note that these weighted averages differ slightly from those compiled by the Particle Data Group [25]. In particular, in
most cases our errors are slightly smaller than those quoted in ref. [25], which makes the comparison even more conservative
with regard to our final conclusions.

Particle                Measured dN/dy (E)     Relative error   Model dN/dy (M)   Residual   (M - E)/E (%)

e⁺e⁻ collisions at 91.25 GeV
π⁰                      9.61 ± 0.29            0.030            9.865              0.89       2.6
π⁺                      8.50 ± 0.10            0.012            8.460             -0.37      -0.44
K⁺                      1.127 ± 0.026          0.023            1.080             -1.79      -4.3
K⁰_S                    1.0376 ± 0.0096        0.0093           1.040              0.25       0.24
p                       0.519 ± 0.018          0.035            0.5727             2.98       9.4
φ                       0.0977 ± 0.0058        0.059            0.1179             3.47       17.1
Λ                       0.1943 ± 0.0038        0.020            0.1867            -1.98      -4.0
Σ⁺                      0.0535 ± 0.0052        0.097            0.04331           -1.96      -23.6
Σ⁰                      0.0389 ± 0.0041        0.11             0.04392            1.22       11.4
Σ⁻                      0.0410 ± 0.0037        0.090            0.03949           -0.40      -3.7
Ξ⁻                      0.01319 ± 0.00050      0.038            0.01285           -0.65      -2.7
Ω                       0.00062 ± 0.00010      0.16             0.0009264          2.55       33.0

Au-Au collisions at 200 GeV
π⁺                      322 ± 25               0.078            330.0              0.32       2.41
π⁻                      327 ± 25               0.077            331.9              0.19       1.46
K⁺                      51.3 ± 6.5             0.13             57.65              0.98       11.0
K⁻                      49.5 ± 6.2             0.13             54.44              0.80       9.07
p                       34.7 ± 4.4             0.13             42.23              1.71       17.8
p̄                       26.7 ± 3.4             0.13             31.24              1.34       14.5
Λ                       16.7 ± 1.12            0.067            14.44             -2.02      -15.7
Λ̄                       12.7 ± 0.92            0.072            11.10             -1.74      -14.4
φ                       7.95 ± 0.74            0.093            6.697             -1.69      -18.7
Ξ⁻                      2.17 ± 0.20            0.092            2.024             -0.73      -7.20
Ξ̄⁺                      1.83 ± 0.206           0.11             1.676             -0.75      -9.16
Ω + Ω̄                   0.53 ± 0.057           0.11             0.6529             2.16       18.8

TABLE III: Measured and fitted 4π multiplicities in e⁺e⁻ collisions at √s = 91.25 GeV and mid-rapidity abundances
in Au-Au collisions at 200 GeV. For e⁺e⁻ collisions, all quoted experimental numbers (for references see
ref. [12]) include weak decay feed-down, while for Au-Au collisions all multiplicities are feed-down corrected except
protons and antiprotons [17]. Our model calculations were carried out accordingly.

As in the comparison between pp and Au-Au data, the average relative deviations between theoretical
and experimental values are close: 9.4% for e⁺e⁻ annihilation at 91.25 GeV and 11.7% for Au-Au
collisions at 200 GeV. This suggests again that the lowest order statistical model formulation yields the
same quality of agreement in the two examined cases. To reinforce this, for illustrative purposes, we
have also here made new fits with rescaled experimental errors, in order to make the relative errors of
measurements equal in the two samples for each species, in much the same way as for the comparison
between pp and Au-Au collisions described in Sect. III. For the unmatched particles, we have taken the
error correspondence π⁰ → π⁻, K⁰_S → K⁻, Σ⁺ → p̄, Σ⁰ → Λ̄, Σ⁻ → Ξ̄⁺, where the first particle belongs
to the e⁺e⁻ sample and the second to the heavy ion sample. In fact, the only particles which have a larger
error in e⁺e⁻ than in Au-Au collisions are the Ω and Σ⁰, whose correspondent in Au-Au was taken as Λ̄.

⁴ The χ²/dof value obtained here is slightly larger than that found in [12]; this is due to the fact that, unlike in ref. [12], we
have not included in the χ² the contribution to errors owing to the uncertainties on masses, widths and branching ratios
of resonances. This choice is motivated by the need to make the comparison between different collision systems as clean
as possible.

                  e⁺e⁻ √s = 91.25 GeV        Au-Au √s_NN = 200 GeV

Fit with the standard samples
T (MeV)           164.7 ± 0.9 (1.9)          168.5 ± 4.0
Normalization     23.2 ± 0.57 (1.2)          13.6 ± 0.58
γ_s               0.656 ± 0.0096 (0.021)     0.932 ± 0.040
μ_B/T             -                          0.173 ± 0.052
χ²/dof            41.5/9                     22.2/8

Fit with the standard samples and same relative errors
T (MeV)           168.8 ± 5.2                167.8 ± 4.1
Normalization     21.3 ± 3.4                 13.15 ± 0.61
γ_s               0.599 ± 0.029              0.968 ± 0.044
μ_B/T             -                          0.200 ± 0.057
χ²/dof            11.0/9                     16.8/8

TABLE IV: Comparison between fit results in e⁺e⁻ collisions at LEP and Au-Au collisions at RHIC for a fit to a
sample of 12 long-lived hadronic species. The parameter referred to as "Normalization" is VT³ for e⁺e⁻ collisions
and AVT³ exp(−0.7 GeV/T) (see eq. (4)) for Au-Au collisions. The errors within brackets are the fit errors rescaled
by √(χ²/dof) (see [25]).

The resulting fit parameters are included in table IV. It is seen that the final χ²/dof is now lower 
in e⁺e⁻ collisions than in Au-Au collisions, although no strong statement can be made on this basis, 
as emphasized at the end of the previous section. This exercise only demonstrates that, under equal error 
conditions, the agreement of the data with the statistical model predictions in the form of eq. (1) is 
approximately the same in e⁺e⁻ and Au-Au collisions. Finally, we note that this result is found to be 
independent of finer details, such as the particle species matching, the number of particles, 
etc. 
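
The role of the relative errors in this exercise can be illustrated with a short Python sketch. It is illustrative only: the yields and model values below are invented placeholders, not measured multiplicities, the function name chi2 is hypothetical, and the model is not refitted after the errors are changed, as it would be in the actual analysis. It shows that, for fixed fractional deviations between model and data, the value of χ² is set by the relative errors assigned to the data.

```python
def chi2(data, model, rel_err):
    """Chi-square with absolute errors sigma_i = rel_err_i * data_i."""
    return sum(((m - d) / (r * d)) ** 2 for d, m, r in zip(data, model, rel_err))

# Placeholder numbers (NOT measured values): three species, each off by 10%.
data = [10.0, 1.0, 0.1]
model = [11.0, 0.9, 0.11]

precise = [0.05, 0.05, 0.05]  # 5% relative errors, as in a precise sample
coarse = [0.10, 0.10, 0.10]   # 10% relative errors, as in a less precise sample

chi2_precise = chi2(data, model, precise)
chi2_coarse = chi2(data, model, coarse)  # same deviations, larger errors
print(chi2_precise, chi2_coarse)
```

Doubling all relative errors lowers χ² by a factor of four although the agreement between model and data is unchanged.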

V. COMPARING DIFFERENT STATISTICAL DESCRIPTIONS OF e⁺e⁻ COLLISIONS 

A question of quite general interest is to what extent a χ² test can be used to check whether 
a specific model input is correct. In other words, we now want to compare the χ²/dof values 
obtained by fitting a specific data set to different theoretical schemes, rather than a given theoretical 
scheme to different data sets. This is again a well-posed question because it involves only a comparative 
assessment. More specifically, we can fit the data to different implementations of the statistical model, 
some of which are likely to be, or are certainly, incorrect. If the model indeed reflects the right physics, 
the χ²/dof of a fit to the data should be consistently larger for "incorrect" versions than for the most 
realistic one. 

The e⁺e⁻ data from LEP at 91.25 GeV contain enough species and are sufficiently precise to address 
this issue. Our test set consists of 15 light-flavored, long-lived particles with widths less than 10 
MeV: π⁰, π⁺, η, η′, K⁺, K⁰_S, φ, p, Λ, Σ⁺, Σ⁰, Σ⁻, Ξ⁻, and Ω⁻. We now fit the measured abundances 
of these species to different implementations of the statistical model, using eq. (1) as a basis. 

• We fit the abundances to the primary production form only, neglecting all resonance decay 
contributions. 

• We fit the abundances to the primary production form only, neglecting all decay contributions 
from resonances of width greater than 10 MeV. In this case, strong decays are neglected, but the 
feed-down of weakly decaying heavy flavor states is included. 

• We fit the abundances correctly taking into account all resonance decays, but we replace exact 
quantum number conservation (canonical suppression) by a grand canonical formulation. This 
means replacing the chemical factors Z(Q − q)/Z(Q) in formula (1) by the fugacities of (@|. 
The results are summarized in table V; we have here denoted the results from our global cluster 
formulation, with "correct" inclusion of all effects, as "full global implementation". It is evident that the model 
indeed provides us with a χ²/dof hierarchy: the cruder the implementation, the larger the resulting 
χ²/dof. 



Fit condition                     T (MeV)     γ_S       χ²/dof

No resonance feed-down            158.7       0.494     1110/12
No strong resonance feed-down     161.1       0.537     391/12
No canonical suppression          144.9       0.690     131/12
Full global implementation        164.8       0.654     44/12



TABLE V: Comparison of parameters for fits of 15 long-lived particles in e⁺e⁻ collisions at √s = 91.2 GeV, using 
different model inputs. 
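
The hierarchy of Table V becomes explicit once the quoted χ² values are divided by the number of degrees of freedom. The short Python sketch below does only this arithmetic on the table entries; the variable names are chosen here for illustration.

```python
# chi2 and dof values quoted in Table V
fits = {
    "No resonance feed-down": (1110, 12),
    "No strong resonance feed-down": (391, 12),
    "No canonical suppression": (131, 12),
    "Full global implementation": (44, 12),
}

# Reduced chi-square for each implementation
reduced = {name: chi2 / dof for name, (chi2, dof) in fits.items()}

# Order the implementations from best to worst reduced chi-square
ranking = sorted(reduced, key=reduced.get)
for name in ranking:
    print(f"{name:32s} chi2/dof = {reduced[name]:.1f}")
```

The reduced values are about 3.7, 10.9, 32.6 and 92.5, so the full global implementation is preferred by a wide margin and the fit without any resonance feed-down is the worst.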



At this point, a specific feature of e⁺e⁻ annihilation is worth emphasizing. The annihilation 
process leads to the production of a pair of quarks, and about 40% of e⁺e⁻ annihilations produce a 
pair of primary c or b quarks, as predicted by the standard model. These then hadronize into heavy 
flavored hadrons, which in turn predominantly decay into strange hadrons. The secondary production of 
light-flavored particles from heavy-flavored ones is a sizeable fraction of the overall particle production. 
It is thus crucial that the model correctly includes the (non-statistical) fractions of the different primary 
quark pairs, and fits neglecting these (e.g., in ref. [13], all except that to the data set at 91.2 GeV) cannot be 
considered realistic. 

Besides this aspect, however, there are others which distinguish different implementations of the statistical 
model concept. As already indicated in Section III, the distribution pattern of the conserved charges 
(baryon number, electric charge, strangeness, charm, bottom) is an issue to be decided when formulating 
a specific model. If a given overall quantum number (for illustration, consider the electric charge Q) in 
two-jet production is zero, the general production pattern will lead to a superposition of two produced 
jets each having zero charge, a pair with Q = ±1, one with Q = ±2, etc., as schematically illustrated 
in Fig. 2. To specify the model, we have to fix the weights w_i of this superposition, and two particular 
schemes have been introduced for this [23]. 




FIG. 2: Schematic distribution of conserved quantum numbers for two-jet hadron production 



• The simplest version is to enforce exact conservation of all discrete quantum numbers separately 
for each of the two jets formed in the annihilation, allowing no quantum number exchange between 
the jets; i.e., one puts w_0 = 1, w_i = 0 ∀ i ≥ 1. This is denoted as uncorrelated jet scheme. 

• The model we have used here allows clusters with statistically distributed discrete quantum numbers, 
imposing exact overall conservation laws; i.e., the w_i are distributed over i according to the 
available multicluster phase space for an overall Q = 0. This is denoted as global cluster scheme. 

There is a clear exception, however, to a transfer of quark quantum numbers: the heavy c or b quarks 
cannot be exchanged, since heavy quark production at hadronization is severely suppressed. In fact, it is 
experimentally well established that heavy quark production originates entirely from the primary 
e⁺e⁻ annihilation or, as an almost negligible higher order perturbative correction, from a hard gluon. It 
is also well established that primary heavy quarks show up in open heavy-flavored hadrons without 
reannihilating. 

There seem to be no general physics grounds on which to exclude one or the other of these scenarios. 
However, color neutralization of the clusters created in e⁺e⁻ collisions requires, before hadron formation, 
the exchange of one or more quark-antiquark pairs between different clusters, and this in turn implies 
the appearance of non-vanishing integer additive charges for single clusters. Hence already the first study 
comparing the two schemes in fits to LEP e⁺e⁻ data at 91.2 GeV obtained for the uncorrelated jet fit 
χ²/dof values twice as large as for the global scheme [23]. For our present study, the comparison is shown 
in table VI and indicates that χ²/dof is more than a factor of two larger for the uncorrelated jet than for 
the global cluster scheme. In particular, the φ meson is found to deviate by about 8σ from the data, which 
is not the case in the global cluster scheme (see table III); moreover, the temperature value becomes 
much larger in the uncorrelated jet scheme. The reason for this behaviour is readily understood: in 
an uncorrelated jet scheme, the phenomenon of canonical suppression is enhanced by the requirement of 
strict local conservation of quantum numbers, and this drives the fit towards higher values of T and/or 
γ_S in order to reproduce the multiplicities; on the other hand, the φ meson is unaffected by canonical 
suppression, and the increase of T and/or γ_S results in an overestimate of the production of this particle. 



Fit condition               T (MeV)          γ_S                χ²/dof

Uncorrelated jet scheme     196.9 ± 1.74     0.622 ± 0.0096     104/12
Global cluster scheme       164.8 ± 0.93     0.654 ± 0.0095     44/12



TABLE VI: Comparison of parameters for fits of 15 long-lived particles in e⁺e⁻ collisions at √s = 91.2 GeV, 
using different conservation schemes for discrete quantum numbers. The errors within brackets are the fit errors 
rescaled by √(χ²/dof) (see fl^ 1 ). 



The quantum number distribution among the produced jets discussed above is only one of the features to 
be specified for a concrete statistical model analysis code. In addition, different codes involve further 
technical input details, addressing e.g. the little-known decays of heavy resonances, the mass range of 
meson vs. baryon resonances to be included, etc. As a result, there exist several codes, differing in these 
rather technical details. Since one cannot give general physics grounds to judge one as "better" than 
another, the only tool we have is to compare the χ²/dof they provide for specific data sets. For the 
twelve-species "standard" e⁺e⁻ data set at 91.2 GeV defined above in Table III, we had obtained 
χ²/dof = 41.5/9 ≈ 4.6. Using the same code, we found [12] for a much larger set of 30 species, including 
short-lived resonances, χ²/dof = 215/27 ≈ 8. This shows that the choice of the data set also affects the 
resulting χ²/dof; including broad resonances, with all the resulting experimental problems, significantly 
increases χ²/dof. 

Using a different code, ref. [13] obtained for essentially the same large data set at 91.2 GeV a fit with 
χ²/dof = 499/28 ≈ 17.8, i.e., a value twice that which we had found. We can only conclude that the code 
used in ref. [13] must invoke physics features not in accord with the data. One such feature was already 
indicated: while we use the global cluster scheme, ref. [13] retains only the first term of the expansion shown in 
Fig. 2, the uncorrelated jet scheme. To look at further details, we consider a comparison proposed in ref. 
[13]. Fixing the temperature T = 158 MeV, the strangeness suppression γ_S = 0.8 and the volume V = 30 
fm³, a set of species' rates is calculated using the code of [13], using our code, and using the publicly 
available code THERMUS [26]. The results are shown in table VII. At first sight, the output yields 
look fairly similar, and in particular those of our code and those of THERMUS indeed show satisfactory 
agreement, with rather small and fluctuating differences. A second look, however, shows that ref. [13] 
predicts for almost all species a larger multiplicity than the other two. This could be due to the inclusion 
of a larger number of heavy resonances. However, the relative differences are not uniformly distributed; 
the outstanding deviations are the proton and Λ, for which ref. [13] gives yields 1.5 - 1.7 times 
larger than those obtained with the other two codes. These two states are among the most accurately 
measured particles at LEP (3.5% error for the p, 2% for the Λ), and hence a sizeable difference in the 
predicted yields will have a large impact on the final fit. We believe that this illustrates once more our 
main point: it is not the absolute value of χ²/dof that matters, but rather the average relative deviation 
of fit to data. And here we expect that different codes shall (or should) lead to the same conclusion. 
Species     Our code      THERMUS      Code of ref. [13]

π⁰          9.163         8.29         11.08
π⁺          7.763         7.14         9.27
K⁺          0.9953        0.945        1.014
K⁰_S        0.9706        0.920        0.976
η           1.047         0.890        1.090
ρ⁰          1.084         1.044        1.12
K*⁰         0.3126        0.285        0.299
p           0.2840        0.334        0.487
φ           0.1274        0.132        0.131
Λ           0.1170        0.120        0.182
Σ*⁺         0.01512       0.0158       0.0197
[…]         0.00862       0.0095       0.0101
[…]         0.000342      0.00340      0.00384
Ω           0.000541      0.000585     0.000625



TABLE VII: Comparison between the output of our code, of THERMUS [26], and of the code used in ref. [13], 
for a chosen set of species and fixed parameter values. 
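
The size of the proton and Λ discrepancy can be read off directly from Table VII. The Python sketch below takes a few rows of the table and forms the ratio of the yield of ref. [13] to each of the other two codes; the dictionary keys and the function name are labels chosen here for illustration.

```python
# Yields quoted in Table VII (T = 158 MeV, gamma_S = 0.8, V = 30 fm^3).
yields = {
    "p":      {"this_work": 0.2840, "THERMUS": 0.334, "ref13": 0.487},
    "Lambda": {"this_work": 0.1170, "THERMUS": 0.120, "ref13": 0.182},
    "K+":     {"this_work": 0.9953, "THERMUS": 0.945, "ref13": 1.014},
    "phi":    {"this_work": 0.1274, "THERMUS": 0.132, "ref13": 0.131},
}

def ratios_to_ref13(species):
    """Ratio of the ref. [13] yield to the yield of each other code."""
    row = yields[species]
    return {code: row["ref13"] / row[code] for code in ("this_work", "THERMUS")}

for species in yields:
    r = ratios_to_ref13(species)
    print(f"{species:7s} ref13/this_work = {r['this_work']:.2f}, "
          f"ref13/THERMUS = {r['THERMUS']:.2f}")
```

For the proton and the Λ the ratios lie between about 1.46 and 1.71, whereas for K⁺ and φ they stay within about 8% of unity.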



VI. CONCLUSIONS 



We have carried out statistical analyses of large multihadron production samples from pp, Au-Au and 
e⁺e⁻ interactions, containing the same number of species and, as far as possible, the same kind of species. 
We use RHIC data from STAR (√s = 200 GeV) for the first two cases, and LEP data averaged over the 
four CERN experiments (√s = 91.25 GeV) for the last. The main results are summarized in table VIII. 
The hadronization temperatures are seen to agree extremely well, and the average deviation between the 
fit abundances and the data values (theory minus experiment, divided by experiment) is around 10% for all three 
configurations. 



Collision                                       pp               Au-Au            e⁺e⁻

Temperature [MeV]                               169.8 ± 4.2      168.5 ± 4.0      164.7 ± 0.9
Average relative deviation data vs. fit [%]     12.5             11.7             9.4
Average relative error of data [%]              18               10               5.7
χ²/dof                                          15.0/8 ≈ 1.9     22.2/8 ≈ 2.8     41.5/9 ≈ 4.6



TABLE VIII: Summary of the fit results for a subset of 12 long-lived particles in high energy pp, Au-Au and 
e⁺e⁻ collisions. 



On the other hand, the resulting χ²/dof values are approximately 2 for pp, 3 for Au-Au and above 4 
for e⁺e⁻. We argue that this does not imply a corresponding hierarchy of agreement with a statistical 
description. Since the average deviations of the fitted abundances are essentially the same in the three 
cases, the observed differences in χ²/dof values are rather a reflection of the relative errors in the three 
experiments, also shown in table VIII. To test this, we have rescaled the errors in the comparison pp vs. 
Au-Au and in e⁺e⁻ vs. Au-Au, and in both cases, the resulting χ²/dof values then become comparable. 
We thus conclude that the hadroproduction abundances from high energy pp, Au-Au and e⁺e⁻ interactions 
agree equally well, i.e., to about 10%, with the best present statistical model parametrization. 
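
A rough consistency check of this argument needs only the averages of Table VIII. If every species deviated from the fit by the average relative deviation δ and carried the average relative error ε, each would contribute (δ/ε)² to χ². This is a back-of-the-envelope assumption introduced here for illustration (the average of a ratio is not the ratio of the averages, and the number of degrees of freedom is smaller than the number of species), so only the ordering and the rough trend should be compared with the fitted χ²/dof.

```python
# Averages and fit results quoted in Table VIII (percentages for deviation and error).
table_viii = {
    "pp":    {"deviation": 12.5, "error": 18.0, "chi2": 15.0, "dof": 8},
    "Au-Au": {"deviation": 11.7, "error": 10.0, "chi2": 22.2, "dof": 8},
    "e+e-":  {"deviation": 9.4,  "error": 5.7,  "chi2": 41.5, "dof": 9},
}

# Crude per-species estimate (deviation/error)^2; an assumption, not a fit result.
estimate = {k: (v["deviation"] / v["error"]) ** 2 for k, v in table_viii.items()}
# Fitted reduced chi-square from the table.
fitted = {k: v["chi2"] / v["dof"] for k, v in table_viii.items()}

for k in table_viii:
    print(f"{k:6s} (deviation/error)^2 = {estimate[k]:.2f}, fitted chi2/dof = {fitted[k]:.2f}")

same_ordering = sorted(estimate, key=estimate.get) == sorted(fitted, key=fitted.get)
print("same ordering:", same_ordering)
```

The crude estimate increases from pp to Au-Au to e⁺e⁻ in the same order as the fitted χ²/dof, although the average deviations themselves are nearly equal.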